Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
Authors
Abstract
In this paper, we propose a novel topic model that incorporates dictionary definitions. Traditional topic models treat words as surface strings, assuming no predefined knowledge about word meaning, and infer topics only from surface word co-occurrence. However, co-occurring words may not be semantically related in a way that matters for topic coherence. Exploiting dictionary definitions explicitly in our model yields a better understanding of word semantics, leading to better text modeling. We use WordNet as the lexical resource for sense definitions. We show that explicitly modeling word definitions significantly improves performance over the baseline on a text categorization task.
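As a rough illustration of the core idea (the glossary, stopword list, and function names below are toy stand-ins, not the paper's actual model), dictionary definitions can inject semantic signal into the word counts a topic model observes by letting definitionally related words co-occur even when they never do on the surface:

```python
from collections import Counter

# Toy glossary standing in for WordNet sense definitions (illustrative only).
GLOSSARY = {
    "bank": "an institution that accepts deposits and lends money",
    "loan": "money lent at interest for a fixed period",
}

STOPWORDS = {"an", "that", "and", "at", "a", "for", "of"}

def expand_with_definitions(tokens):
    """Augment a document's tokens with the content words of each
    token's dictionary definition, so definitionally related words
    co-occur even when they never do on the surface."""
    expanded = list(tokens)
    for tok in tokens:
        gloss = GLOSSARY.get(tok, "")
        expanded.extend(w for w in gloss.split() if w not in STOPWORDS)
    return Counter(expanded)

counts = expand_with_definitions(["bank", "loan"])
# "money" now co-occurs with both headwords via their definitions.
```

A real model would of course integrate the definitions into the generative process rather than simple count expansion, but the sketch shows where the extra semantic evidence comes from.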
Related papers
Gloss-Based Semantic Similarity Metrics for Predominant Sense Acquisition
In recent years there have been various approaches to the automatic acquisition of the predominant senses of words. This information can be exploited as a powerful back-off strategy for word sense disambiguation, given the Zipfian distribution of word senses. Approaches that do not require manually sense-tagged data have been proposed for English, exploiting available lexical resources, notably Wor...
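A minimal sketch of gloss-based sense scoring in the spirit described above (the sense inventory is a toy stand-in for WordNet, and real systems use richer similarity metrics than raw word overlap): the predominant sense is the one whose gloss best matches the word's aggregate corpus context.

```python
# Simplified Lesk-style gloss overlap for predominant sense acquisition.
# The sense inventory below is illustrative, not from a real lexicon.
SENSES = {
    "crane": {
        "bird": "large wading bird with long legs and a long neck",
        "machine": "lifting machine used on construction sites",
    }
}

def predominant_sense(word, corpus_contexts):
    """Pick the sense whose gloss overlaps most with the word's
    aggregate corpus context, yielding a corpus-wide sense prior
    usable as a back-off for disambiguation."""
    context = set()
    for ctx in corpus_contexts:
        context.update(ctx)
    best, best_score = None, -1
    for sense, gloss in SENSES[word].items():
        score = len(context & set(gloss.split()))
        if score > best_score:
            best, best_score = sense, score
    return best

sense = predominant_sense("crane", [["construction", "lifting", "site"],
                                    ["machine", "crane", "steel"]])
# → "machine"
```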
A New Semantics: Merging Propositional and Distributional Information
Despite hundreds of years of study on semantics, theories and representations of semantic content—the actual meaning of the symbols used in semantic propositions—remain impoverished. The traditional extensional and intensional models of semantics are difficult to actually flesh out in practice, and no large-scale models of this kind exist. Recently, researchers in Natural Language Processing (N...
What Is Word Meaning, Really? (And How Can Distributional Models Help Us Describe It?)
In this paper, we argue in favor of reconsidering models for word meaning, using as a basis results from cognitive science on human concept representation. More specifically, we argue for a more flexible representation of word meaning than the assignment of a single best-fitting dictionary sense to each occurrence: Either use dictionary senses, but view them as having fuzzy boundaries, and assu...
Learning to Understand Phrases by Embedding the Dictionary
Distributional models that learn rich semantic word representations are a success story of recent NLP research. However, developing models that learn useful representations of phrases and sentences has proved far harder. We propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. Neural language embedding models can be e...
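The simplest way to bridge lexical and phrasal semantics with a dictionary is to compose a definition's word vectors and compare the result against the headword's vector; this averaging baseline (toy vectors below, illustrative only — real systems use learned embeddings and neural composition) shows the shape of the idea:

```python
import math

# Toy 2-d word vectors; a real model would learn these from corpora.
VEC = {
    "feline": [1.0, 0.0], "pet": [0.8, 0.2],
    "small": [0.5, 0.5], "cat": [0.9, 0.1],
}

def embed_definition(words):
    """Compose a definition vector by averaging its word vectors,
    the simplest composition baseline neural models improve on."""
    dims = len(next(iter(VEC.values())))
    avg = [0.0] * dims
    for w in words:
        for i, x in enumerate(VEC[w]):
            avg[i] += x / len(words)
    return avg

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# The definition "small feline pet" lands close to "cat".
sim = cosine(embed_definition(["small", "feline", "pet"]), VEC["cat"])
```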
Implementing a Reverse Dictionary, based on word definitions, using a Node-Graph Architecture
In this paper, we outline an approach to build graph-based reverse dictionaries using word definitions. A reverse dictionary takes a phrase as an input and outputs a list of words semantically similar to that phrase. It is a solution to the Tip-of-the-Tongue problem. We use a distance-based similarity measure, computed on a graph, to assess the similarity between a word and the input phrase. We...
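A minimal sketch of the graph-based reverse-dictionary idea (the edge list, sentinel value, and ranking function are illustrative assumptions, not the paper's architecture): words link to the content words of their definitions, and candidate headwords are ranked by shortest-path distance to the words of the input phrase.

```python
from collections import deque

# Toy definition graph: each headword links to the content words of
# its definition (a stand-in for the node-graph the paper describes).
EDGES = {
    "sofa": ["seat", "cushion"],
    "chair": ["seat"],
    "seat": ["furniture"],
    "cushion": ["soft", "pad"],
    "apple": ["fruit"],
}

def distance(src, dst):
    """Shortest-path hops in the directed definition graph;
    a large sentinel when unreachable."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        for nxt in EDGES.get(node, []):
            if nxt == dst:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return 10**6

def reverse_lookup(phrase_words, candidates):
    """Rank candidate headwords by total graph distance to the
    input phrase's words (smaller = more similar)."""
    def score(cand):
        return sum(min(distance(cand, w), distance(w, cand))
                   for w in phrase_words)
    return sorted(candidates, key=score)

ranking = reverse_lookup(["seat", "cushion"], ["sofa", "chair", "apple"])
# → ["sofa", "chair", "apple"]
```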